Web Page Classification: A Soft Computing Approach
نویسندگان
چکیده
The Internet makes it possible to share and manipulate a vast quantity of information efficiently and effectively, but the rapid and chaotic growth experienced by the Net has generated a poorly organized environment that hinders the sharing and mining of useful data. The need for meaningful web-page classification techniques is therefore becoming an urgent issue. This paper describes a novel approach to web-page classification based on a fuzzy representation of web pages. A doublet representation that associates a weight with each of the most representative words of the web document so as to characterize its relevance in the document. This weight is derived by taking advantage of the characteristics of HTML language. Then a fuzzy-rule-based classifier is generated from a supervised learning process that uses a genetic algorithm to search for the minimum fuzzy-rule set that best covers the training examples. The proposed system has been demonstrated with two significantly different classes of web pages.
منابع مشابه
A Novel Approach to Feature Selection Using PageRank algorithm for Web Page Classification
In this paper, a novel filter-based approach is proposed using the PageRank algorithm to select the optimal subset of features as well as to compute their weights for web page classification. To evaluate the proposed approach multiple experiments are performed using accuracy score as the main criterion on four different datasets, namely WebKB, Reuters-R8, Reuters-R52, and 20NewsGroups. By analy...
متن کاملExpert Discovery: A web mining approach
Expert discovery is a quest in search of finding an answer to a question: “Who is the best expert of a specific subject in a particular domain within peculiar array of parameters?” Expert with domain knowledge in any field is crucial for consulting in industry, academia and scientific community. Aim of this study is to address the issues for expert-finding task in real-world community. Collabor...
متن کاملFinding the Best page using Synonyms
Rating a page to be a best one, based only on Page Ranking algorithm of Brin and Page would be insufficient. This method relied totally on Link information alone. However, due to application of Soft Computing in Data Mining and Knowledge Discovery, machines were made more effective, additional features of a Page involving its indexing, terms used, capitalizations, anchor texts, hit information,...
متن کاملبررسی تأثیرات رایانش ابری بر یادگیری الکترونیکی
In the world of training, online training is introduced as a modern model of training services. Cloud computing is a modern technology which is provided software, infrastructure and platform as internet. Also, online training is introduced as a modern model of training services on the web. In this research, the impact of cloud computing on e-learning on the case of Mehralborz online university ...
متن کاملClassification of Ultra-high Resolution Images using Soft Computing
The main objective of this paper is to use a computational intelligence algorithm for preparing a mapping map that categorizes different patterns of identification of infected areas and changes in radiation pollution. In this paper, the use of the fuzzy inference system has been proposed to determine the degree of radiation contamination in the regions. The study uses ultra-high resolution spec...
متن کامل